
    Real-time model-based SLAM using line segments

    Existing monocular vision-based SLAM systems favour interest point features as landmarks, but these are easily occluded and can only be reliably matched over a narrow range of viewpoints. Line segments offer an interesting alternative, as line matching is more stable with respect to viewpoint changes and lines are robust to partial occlusion. In this paper we present a model-based SLAM system that uses 3D line segments as landmarks. Unscented Kalman filters are used to initialise new line segments and generate a 3D wireframe model of the scene that can be tracked with a robust model-based tracking algorithm. Uncertainties in the camera position are fed into the initialisation of new model edges. Results show the system operating in real-time with resilience to partial occlusion. The maps of line segments generated during the SLAM process are physically meaningful and their structure is measured against the true 3D structure of the scene.
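    The unscented initialisation described above can be pictured with a short sketch. The following is a minimal illustration, not the paper's implementation: it assumes a hypothetical backproject function that turns one sampled camera pose into a 3D endpoint hypothesis, and propagates the pose uncertainty through it with the unscented transform.

```python
import numpy as np

def sigma_points(mean, cov, kappa=1.0):
    """Generate the 2n+1 sigma points of the unscented transform."""
    n = mean.size
    S = np.linalg.cholesky((n + kappa) * cov)  # matrix square root
    pts = [mean] + [mean + S[:, i] for i in range(n)] \
                 + [mean - S[:, i] for i in range(n)]
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    return np.array(pts), w

def init_endpoint(cam_mean, cam_cov, backproject):
    """Propagate camera-pose uncertainty into a 3D line-endpoint estimate.

    backproject: assumed callable mapping one camera-pose sample to a 3D
    endpoint hypothesis (e.g. the viewing ray cut at a depth prior).
    """
    pts, w = sigma_points(cam_mean, cam_cov)
    ys = np.array([backproject(p) for p in pts])   # transformed samples
    y_mean = w @ ys                                # weighted mean
    diff = ys - y_mean
    y_cov = (w[:, None] * diff).T @ diff           # weighted covariance
    return y_mean, y_cov
```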

    Improving 3D Keypoint Detection from Noisy Data Using Growing Neural Gas

    3D sensors provide valuable information for mobile robotic tasks such as scene classification or object recognition, but they often produce noisy data that makes it impossible to apply classical keypoint detection and feature extraction techniques. Therefore, noise removal and downsampling have become essential steps in 3D data processing. In this work, we propose a 3D filtering and downsampling technique based on a Growing Neural Gas (GNG) network. The GNG method is able to deal with outliers present in the input data, and it can represent 3D spaces by obtaining an induced Delaunay triangulation of the input space. Experiments show how state-of-the-art keypoint detectors improve their performance when the GNG output representation is used as input data. Descriptors extracted at the improved keypoints yield better matching in robotics applications such as 3D scene registration.
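    As a rough illustration of how a GNG network can filter and downsample a noisy cloud, here is a compact sketch of the classic GNG loop applied to 3D points; the surviving unit positions serve as the denoised representation. Parameter values and the simplified node handling are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def gng_filter(points, max_units=200, steps=20000,
               eps_w=0.05, eps_n=0.005, max_age=50,
               insert_every=100, alpha=0.5, beta=0.995):
    """Fit a GNG network to an Nx3 cloud; returns unit positions."""
    rng = np.random.default_rng(0)
    units = points[rng.choice(len(points), 2, replace=False)].astype(float)
    error = np.zeros(2)
    edges = {}                                  # (i, j), i < j  ->  age

    for t in range(1, steps + 1):
        x = points[rng.integers(len(points))]   # random input signal
        d = np.linalg.norm(units - x, axis=1)
        s1, s2 = np.argsort(d)[:2]              # two nearest units
        error[s1] += d[s1] ** 2
        units[s1] += eps_w * (x - units[s1])    # move the winner
        for e in list(edges):                   # age edges at the winner,
            if s1 in e:                         # nudge its neighbours
                edges[e] += 1
                n = e[0] if e[1] == s1 else e[1]
                units[n] += eps_n * (x - units[n])
                if edges[e] > max_age:
                    del edges[e]
        edges[tuple(sorted((s1, s2)))] = 0      # (re)connect the winners
        # periodically insert a unit between the worst node and its
        # worst neighbour (isolated-node removal omitted for brevity)
        if t % insert_every == 0 and len(units) < max_units:
            q = int(np.argmax(error))
            nbrs = [e[0] if e[1] == q else e[1] for e in edges if q in e]
            if nbrs:
                f = max(nbrs, key=lambda n: error[n])
                r = len(units)
                units = np.vstack([units, (units[q] + units[f]) / 2.0])
                error[q] *= alpha
                error[f] *= alpha
                error = np.append(error, error[q])
                edges.pop(tuple(sorted((q, f))), None)
                edges[tuple(sorted((q, r)))] = 0
                edges[tuple(sorted((f, r)))] = 0
        error *= beta                           # global error decay
    return units   # filtered, downsampled representation of the cloud
```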

    Real-time gaze estimation using a Kinect and a HD webcam

    In human-computer interaction, gaze orientation is an important and promising source of information about the attention and focus of users. Gaze detection can also be an extremely useful metric for analysing human mood and affect. Furthermore, gaze can be used as an input method for human-computer interaction. However, real-time, accurate gaze estimation is still an open problem. In this paper, we propose a simple and novel model for estimating, in real time, the gaze direction of a user on a computer screen. The method uses cheap capture devices: an HD webcam and a Microsoft Kinect. We consider the gaze motion of a user facing forwards to be composed of the local gaze motion shifted by eye motion and the global gaze motion driven by face motion. We validate our proposed model of gaze estimation and provide an experimental evaluation of the reliability and the precision of the method.
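    The decomposition described above (global gaze from face motion, shifted by local gaze from eye motion) reduces to a simple composition once the two mappings are calibrated. A minimal sketch, with the assumed interfaces map_head and map_eye standing in for whatever calibrated regressors are actually used:

```python
import numpy as np

def gaze_on_screen(head_pose, pupil_offset, map_head, map_eye):
    """Estimate the on-screen gaze point as global + local motion.

    head_pose:    6-DoF head pose, e.g. from the Kinect.
    pupil_offset: 2D pupil displacement, e.g. from the HD webcam.
    map_head/map_eye: assumed calibrated mappings to screen coordinates
                      and pixel shifts respectively.
    """
    global_point = np.asarray(map_head(head_pose))     # where the face points
    local_shift = np.asarray(map_eye(pupil_offset))    # shift from eye motion
    return global_point + local_shift
```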

    There Is More Than One Way to Get Out of a Car: Automatic Mode Finding for Action Recognition in the Wild

    “Actions in the wild” is the term given to examples of human motion performed in natural settings, such as those harvested from movies [10] or the Internet [9]. State-of-the-art performance in this domain is orders of magnitude lower than in more contrived settings, one of the primary reasons being the huge variability within each action class. We propose to tackle recognition in the wild by automatically breaking complex action categories into multiple modes/groups and training a separate classifier for each mode. This is achieved using RANSAC, which identifies and separates the modes while rejecting outliers. We employ a novel reweighting scheme within the RANSAC procedure to iteratively reweight training examples, ensuring their inclusion in the final classification model. Our results demonstrate the validity of the approach, and for classes which exhibit multi-modality we achieve more than double the performance of approaches that assume a single modality.
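    The mode-finding idea can be sketched as a RANSAC-style loop over the training set: fit the dominant consensus group, split it off as one mode, and reweight the leftovers so a later mode picks them up. The feature-space model below (a mean with a radius threshold) is a stand-in for the paper's actual formulation, and the reweighting rule is an illustrative assumption:

```python
import numpy as np

def find_modes(X, thresh=1.0, iters=200, min_size=5, rng=None):
    """Split an NxD feature matrix into consensus groups (modes)."""
    rng = rng or np.random.default_rng(0)
    remaining = np.arange(len(X))
    weights = np.ones(len(X))
    modes = []
    while len(remaining) >= min_size:
        best = None
        for _ in range(iters):
            seed = rng.choice(remaining, size=min_size, replace=False)
            centre = X[seed].mean(axis=0)          # candidate mode centre
            d = np.linalg.norm(X[remaining] - centre, axis=1)
            inliers = remaining[d < thresh]
            score = weights[inliers].sum()          # weighted consensus
            if best is None or score > best[0]:
                best = (score, inliers)
        score, inliers = best
        if len(inliers) < min_size:
            break
        modes.append(inliers)                       # one classifier per mode
        remaining = np.setdiff1d(remaining, inliers)
        weights[remaining] *= 2.0                   # upweight the leftovers
    return modes
```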

    Purposive sample consensus: A paradigm for model fitting with application to visual odometry

    RANSAC (random sample consensus) is a robust algorithm for model fitting and outlier removal; however, it is neither efficient nor reliable enough to meet the requirements of many applications where time and precision are critical. Various algorithms have been developed to improve its model-fitting performance. This paper introduces a new algorithm named PURSAC (purposive sample consensus), which has three major steps that address the limitations of RANSAC and its variants. Firstly, instead of assuming that all samples have the same probability of being inliers, PURSAC exploits their differences and purposively selects sample sets. Secondly, since sampling noise always exists, the selection also follows a sensitivity analysis of the model against that noise. The final step applies a local optimisation to further improve model-fitting performance. Tests show that PURSAC can achieve very high model-fitting certainty with a small number of iterations. Two cases are investigated for PURSAC implementation: it is applied to line fitting to explain its principles, and then to feature-based visual odometry, which requires efficient, robust and precise model fitting. Experimental results demonstrate that PURSAC dramatically improves the accuracy and efficiency of fundamental matrix estimation, resulting in precise and fast visual odometry.
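    The first two PURSAC steps (purposive selection plus noise-sensitivity screening) can be sketched as follows; the scoring convention, pool size and spread test are illustrative assumptions, not the published algorithm:

```python
import numpy as np

def purposive_sets(match_scores, pts, n_sets=50, set_size=8,
                   pool_frac=0.2, min_spread=20.0, rng=None):
    """Draw minimal sets for, e.g., fundamental-matrix hypotheses.

    match_scores: lower is better (e.g. a descriptor ratio-test score);
    pts: Nx2 image coordinates, used for the spread check below.
    """
    rng = rng or np.random.default_rng(0)
    order = np.argsort(match_scores)                      # best matches first
    pool = order[: max(set_size, int(len(order) * pool_frac))]
    sets = []
    for _ in range(n_sets * 20):                          # bounded attempts
        if len(sets) == n_sets:
            break
        s = rng.choice(pool, size=set_size, replace=False)
        # reject tightly clustered samples: a model fitted to nearly
        # coincident points is highly sensitive to sampling noise
        if np.ptp(pts[s], axis=0).min() >= min_spread:
            sets.append(s)
    return sets   # each set seeds one model hypothesis
```

    Each returned set would then be scored against all matches as in standard RANSAC, with the best hypothesis refined by a local optimisation step.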

    A visual category filter for Google images

    We extend the constellation model to include heterogeneous parts which may represent either the appearance or the geometry of a region of the object. The parts and their spatial configuration are learnt simultaneously and automatically, without supervision, from cluttered images. We describe how this model can be employed to rank the output of an image search engine when searching for object categories. It is shown that visual consistencies in the output images can be identified and then used to rank the images according to their closeness to the visual object category. Although the proportion of good images may be small, the algorithm is designed to be robust and is capable of learning either in a totally unsupervised manner or with a very limited amount of supervision. We demonstrate the method on image sets returned by Google's image search for a number of object categories, including bottles, camels, cars, horses, tigers and zebras.
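    The re-ranking step itself is straightforward once the category model has been learnt from the noisy result set. A tiny sketch with an assumed model.score interface (not the paper's actual constellation-model API):

```python
def rank_results(images, model):
    """Reorder search results by their likelihood under the learnt model."""
    scored = [(model.score(img), img) for img in images]
    scored.sort(key=lambda t: t[0], reverse=True)   # best matches first
    return [img for _, img in scored]
```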

    Unnecessary Image Pair Detection for a Large Scale Reconstruction
